Solver for hardware based computing

ABSTRACT

Full-AC load flow constitutes a core computation in power system analysis. The present invention provides a performance gain with a hardware implementation of a sparse-linear solver using a Field Programmable Gate Array (FPGA). The invention also relates to the design, simulation, and hardware verification of a static transmission line model for analog power flow computation. Operational transconductance amplifiers are employed in the model based on a previously proposed DC emulation technique of power flow computation, and provide reconfigurability of transmission line parameters via transconductance gain. The invention also uses Analog Behavioral Models (ABMs) in an efficient strategy for designing analog emulation engines for large-scale power system computation. Results of PSpice simulations of these emulation circuits are compared with industrial grade numerical simulations for validation. The application is also concerned with the development of a generator model using analog circuits for load flow emulation for power system analysis to reduce computation time. The generator model includes reconfigurable parameters using operational transconductance amplifiers (OTAs). The circuit module is used with other reconfigurable circuits, i.e., transmission lines and loads.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The current invention is related to a programmable linear algebraicsolver for large sparse matrices based on a lower-upper triangulardecomposition method to be used in a hardware computing environment.Furthermore, the present invention also relates to models for powersystem components and power systems.

2. Brief Description of the Prior Art

Currently, power flow computation for large power systems is timeintensive. The calculations are non-linear in nature and lengthyiteration schemes are the currently preferred solution. This presents aproblem as many assumptions and simplifications are required to solvethe equations in a timely manner. In addition, expansion of the powergrid, increasing necessity and complexity of contingency studies andintroduction of economic analysis are demanding further computationalburden. Traditional digital methods are too slow to solve theaforementioned demands quickly. This affects the security, reliabilityand market operation of power systems. Ideally a real-time computationtool is preferable, specifically in market activities and operation.

In power flow computations, sparse linear solvers are used to calculatethe update vector for the Jacobian matrix at each iterative step duringNewton-Raphson iteration for load flow solution. The bottleneck inNewton power flow iteration is the solution of a sparse system of linearequations, which is required for each iteration, and typically consumes85% of the execution time in large-scale power systems [1]. In practice,many related load flow computations are performed as part of contingencyanalysis. These multiple load flow computations can easily be performedin parallel. It remains desirable to speed up the computation of anindividual load flow computation. Typical power flow computation isperformed using general-purpose personal computing platforms based onprocessors such as the Intel Pentium IV®. High performance uniprocessorsparse linear solvers on these platforms utilize only roughly 1-4% ofthe floating-point throughput. Attempts have also been made to increaseperformance through the use of cluster computing [1], however, due tothe irregular data flow and small granularity of the problem thesemethods do not scale very well.

Analog computation provides a viable alternative to traditionalapproaches. Among the advantages over traditional digital methods arephysically realizable solutions and faster computation times. In orderto make the analog method a viable tool in power system analysis,accurate models of power system components are required. Areconfigurable transmission line model is considered for a specificanalog computation method. Traditional digital methods are expensive andslow in comparison to analog computers. The power flow solution isobtained almost instantaneously regardless of the number of componentsin the network with analog circuits. Essentially computation time isindependent of network size. Effectively the solution is obtained asquickly as the system stabilizes. Experimentation has shown the abilityto calculate solutions even faster than real time. In prior research,simulation time for a two-machine system was typically 104 times shorterthan the real time simulated phenomena [101]. This is achieved bymodeling generator dynamics for the purpose of transient stabilityevaluation.

The modeling and design of analog components is one obstacle that mustbe overcome. The solutions will only be as accurate as the models andmeasurements will allow. Without clearly defined valid models for powersystem components, the analog computational method will never berealized. In addition, these models also must cater towardscomputational speed. Specifically, in power system operation, multipleruns and contingencies are required to be executed extremely quickly. Apriority for these analog models is to yield valid solutions whileallowing fast reconfigurability.

An efficient strategy for designing analog emulation engines for largescale power system computation may be provided through the use of AnalogBehavioral Models (ABMs) of PSpice [201]. Emulation can be described asan act of a physical system imitating a real system. Emulation in thispatent application, therefore, is the representation of physicalcharacteristics of a real life object (power system) using an electriccircuit equivalent. The representational relationship could bemathematical, scaled, or both. The circuit equivalent representation haswithin it the model of a real system as well as a method of itssolution. The speed of computation is as quick as the response of thecircuit itself, which could be real time, faster or slower than realtime depending on the parameter settings. The solution is continuous intime and amplitude.

Simulation on the other hand is an attempt to predict/replicate aspectsof the behavior of a real system by creating an approximate(mathematical) model of it. Computer modeling, for example by providinga special-purpose computer program, may do this. The program is composedof equations that describe the functional relationships within the realsystem. When the program is run, the resulting mathematical dynamicsform an analog of the behavior of the real system with the resultspresented in the form of data.

The need to emulate/simulate large and complex mixed-signal systems hasprompted the development of high-level circuit representations foranalog components; ABMs serve this purpose. The interior of a behavioralmodel, however, is different in that it is implemented in terms ofalgebraic or differential equations father than physical analogcomponents. In order words, in a behavioral model, the focus is on theinput/output relationship of the block. The fundamental advantage of thebehavioral modeling technique in top-down designs is that the simulationcan provide fast prediction of system performance. The approach helps toselect proper architectures for circuit implementation and analyzetradeoffs at the early design stages. The transistor level simulation(bottom-up design), comparatively, can be very tedious and cumbersomeespecially for mixed-signal chips containing a large number of analogcomponents. Under such circumstances, behavioral models enable designersto verify the complex system efficiently and result in fast systemevaluation prior to embarking on full structural design andimplementation.

Static load flow analysis, which is based on the power flow equations,is one method currently used by the industry. The problem is that suchanalyses solve using a sequential method, which makes the simulationprocess very slow in complex networks. The speed of the digital computerto solve for static load flow is greatly dependent on the size of thepower system network. An alternative method to solve static load flow isusing an analog computer to emulate the behavior of the power system,instead of using model equations to simulate the network.

In the past two decades, silicon-on-insulator (SOI) complementary metaloxide semiconductor (CMOS) technology has become a major technology forintegrating VLSI systems using a low-supply voltage [301]. Along withthe progress in the CMOS technology, CMOS devices have been scaled downcontinuously and the corresponding power consumption has also beendecreased, which have triggered advances in the circuit designtechniques for various designs and applications. The association betweenCMOS circuits and VLSI chips is becoming increasingly easier toimplement.

With these advances, Fried et al. [302], proposed an approach using ananalog VLSI chip for simulation of the behavior of power systems. Usingan analog VLSI chip can be limited if the fabricated chip is only usefulfor one power system configuration. Gu et al. [303], presented a conceptto add reconfigurable parameters using switches and analog voltages tochange the system configuration and parameters.

SUMMARY OF THE INVENTION

According to one aspect of the current invention, the system provides asubstantially improved computing platform for solving computationintensive problems. One implementation includes Field Programmable GateArrays (FPGA) for the solution of sparse linear systems.

According to a second aspect of the current invention, the system isimplemented by a combination of downloadable core software and adedicated hardware such as FPGA. The core software customizes theapplication of the FPGA for various computational analyses.

According to a third aspect of the current invention, a statictransmission line is modeled for analog power flow computation.Operational transconductance amplifiers in the model providereconfigurability of transmission line parameters via transconductancegain.

According to a fourth aspect of the current invention, Analog BehavioralModels (ABMs) are implemented for use in analog emulation engines forlarge-scale power system computation. ABMs are applied for modelverification and validation prior to full structural design andimplementation.

According to a fifth aspect of the current invention, a generator modelusing analog circuits is developed for load flow emulation for powersystem analysis to reduce computation time. The generator model mayinclude reconfigurable parameters using operational transconductanceamplifiers (OTAs). The circuit module is used with other reconfigurablecircuits, i.e., transmission lines and loads.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram illustrating SYMAMD elimination row histogram.

FIG. 2 is a diagram illustrating the architecture of one preferredembodiment according to the current invention.

FIG. 3 is a diagram illustrating the pivot search logic block of onepreferred embodiment according to the current invention.

FIG. 4 is a diagram illustrating the update pivot column/row read logicblock of one preferred embodiment according to the current invention.

FIG. 5 is a diagram illustrating the update unit of one preferredembodiment according to the current invention.

FIG. 6 is a graph illustrating the relation between the increased speedand the number of update units according to the current invention.

FIG. 7 is a graph illustrating the relative solve times for a singleupdate unit according to the current invention.

FIG. 8 is a graph illustrating the relative solve times for sixteenupdate units according to the current invention.

FIG. 9 is a graph illustrating the LU hardware performance relative to aPentium IV® processor.

FIG. 10 is a circuit diagram illustrating a DC network emulation for athree-bus power system.

FIG. 11 is a diagram illustrating a lossy transmission line model.

FIG. 12 is a diagram illustrating a lossless transmission line model.

FIG. 13 is a diagram illustrating an ideal operational transconductanceamplifier (OTA).

FIG. 14 is a diagram illustrating a double ended OTA transmission linemodel.

FIG. 15 is a diagram illustrating a potentiometer relationship to OTAmodel.

FIG. 16 is a diagram illustrating PSpice schematic for bias currentcontrol.

FIG. 17 is a graph plotting bias current vs. bias voltage.

FIG. 18 is a graph plotting theoretical and simulated resistance vs.bias voltage.

FIG. 19 is a graph plotting line current vs. line voltage.

FIG. 20 is a graph plotting line current vs. line voltage linear region.

FIG. 21 is a graph illustrating effective resistance of line model.

FIG. 22 is a graph plotting effective resistance of line model vs. linevoltage.

FIG. 23 is a graph plotting hardware and spice effective resistance vs.bias voltage.

FIG. 24 is a graph plotting HW line current vs. line voltage for variousbias currents.

FIG. 25 is a graph illustrating line current linearity.

FIG. 26 is a graph plotting HW resistance vs. line voltage.

FIG. 27 is a diagram illustrating a three-bus power system topology.

FIG. 28 is a diagram illustrating a three-bus prototype board.

FIG. 29 is a diagram illustrating an emulation circuit setup.

FIG. 30 is a diagram illustrating a power world power flow solution.

FIG. 31 is a diagram illustrating analog and ABM sine shaper.

FIG. 32 is a time line of hardware technology.

FIG. 33 illustrates complex computation methodology.

FIG. 34 is a diagram illustrating a complex computation implementation.

FIG. 35 is a diagram illustrating generator real power computationmethodology.

FIG. 36 is a diagram illustrating computationally feasible bus voltage.

FIG. 37 illustrates variable representation in analog emulator.

FIG. 38 is a diagram illustrating a power system emulator hardwaresetup.

FIG. 39 is a diagram illustrating a classical generator model.

FIG. 40 is a block diagram illustrating a swing equation functionalblock.

FIG. 41 is a block diagram illustrating generator model functions.

FIG. 42 is a diagram illustrating a single-machine CMON analog emulator.

FIGS. 43 a-b show diagrams illustrating results of power system andanalog emulator.

FIG. 44 is a diagram illustrating results of analog emulator with offsetcorrections.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

In the current invention, application-specific hardware is employed toreduce the computation time for the solution of the sparse linearsystems arising in complex computations. The use of special-purposehardware reduces the overhead costs, better utilizes floating-pointhardware, and provides fine-grained parallelism. The use of FieldProgrammable Gate Arrays (FPGA) provides a convenient platform to designand implement such hardware. By building hardware that is specificallydesigned to solve the sparse matrices found in complex calculations,such as power system calculations, rather than utilizing general-purposeprocessors and parallel processing, the present invention significantlyimproves the efficiency of the linear solver and hence reduces thecomputing time compared to traditional platforms.

Thus, one aspect of the present invention relates to hardware for thedirect solution of sparse linear systems. The hardware design takesadvantage of properties of the matrices arising in the specificcomputation obtained from a detailed analysis of actual systems.

Overview of FPGA Technology

Field programmable gate arrays (FPGA) are reconfigurable logic devicesthat allow users to configure device operation through software. Arraysof programmable logic blocks, combined with programmable I/O blocks, andprogrammable interconnects form the underlying reconfigurable fabric forFPGAs. Software code, typically written in a hardware descriptionlanguage (HDL) such as VHDL or Verilog, is mapped to these devicesallowing the user to specify the functionality of the FPGA. FPGAs can beprogrammed multiple times, unlike other programmable logic devices(PLD), which can only be programmed once. In addition, unlikeapplication specific integrated circuits (ASIC), there is not a longlead time in between design and testing since the designer does not haveto wait for masks and then for manufacturing to return a part for test.The user simply compiles the design for the desired FPGA and thenprograms the device.

FPGA device densities are in the millions for logic gates with clockrates of 500 MHz for synthesized logic (see www.xilinx.com). Beyondincreases in logic density, the addition of high performance embeddedarithmetic units, large amounts of embedded memory, high speed embeddedprocessor cores, and high speed I/O has allowed FPGAs to grow beyondjust simple prototyping devices. FPGAs can outperform high-end personalcomputers [3]. Based on synthesis results, a high-end FPGA such as aXilinx Virtex-4™ contains enough embedded multiplier and logic resourcesto support over 40 high-speed single precision floating-pointmultipliers operating at 300 MHz. Unlike personal computer processors,FPGAs have the advantage of being able to be mapped to the desiredapplication, rather than being designed to perform well across a widerange of applications. This allows the greatest amount of utilization ofthe available resources on the FPGA.

Overview of Power Flow by Newton's Method

The present invention will be exemplified based on power flowcomputation. However, the present invention is applicable to other typesof complex computations as well.

Power flow solution via Newton's method [4] involves iterating thefollowing equation:

−JΔx=f(x)  (1)

Until f(x)=0 is satisfied. The Jacobian, J, of the power system, is alarge highly sparse matrix (very few non-zero entries), which while notsymmetric has a symmetric pattern of non-zero elements. Δx is a vectorof the change in the voltage magnitude and phase angle for the currentiteration. And f(x) is a vector representing the real and imaginarypower mismatch. The above set of equations are of the form Ax=B, whichcan be solved using a direct linear solver. Direct linear solversperform Gaussian Elimination, which is performed by decomposing tithematrix A into Lower (L) and Upper (U) triangular factors, followed byforward and backward elimination to solve for the unknowns.

Dense matrix linear solvers are not as efficient as specialized solversdesigned for sparse matrices. These solvers do not compute the non-zeroentries. However, they suffer from a phenomenon known as fill-in.Fill-in of a sparse matrix during LU decomposition arises when apreviously zero entry in the matrix becomes non-zero during theelimination process. Large amounts of fill-in will degrade theperformance of sparse solvers by increasing the number of mathematicaloperations required for the LU decomposition. By using an appropriateelimination order, the amount of fill-in can be dramatically reduced[5]. Selecting the optimal ordering is an NP complete problem [6].However, effective heuristics are known that work quite well for thematrices in load flow computation. In addition, sparse matrix solvers donot share the regular computational pattern of dense matrix solvers andas a consequence there is significant overhead in accessing andmaintaining data structures that utilize the sparse data.

Benchmark System and Properties

An in-depth analysis of the elimination process was performed using fourpower systems of various sizes. Two of the systems, the 1648 Bus and7917 Bus systems, were obtained from the PSS/E data collection, and theother two systems, the 10279 Bus and 26829 Bus systems, were obtainedfrom power systems used by PJM Interconnect. The purpose of the analysiswas to obtain features of the systems arising in typical power systemcomputation that could be utilized in the design of special-purposehardware. In addition, the systems were also used to benchmark theproposed hardware and compare it against a state-of-the art softwaresolution. Table 1 summarizes the number of branches and number ofnon-zeros (NNZ) in the Ybus matrices and the size and number ofnon-zeros of typical Jacobian matrices for the benchmark systems.

TABLE 1 Summary of Power System Matrices System Branches NNZ YBUS NNZJacobian Jacobian Size 1648 Bus 2,602 6,680 21,196 2,982 7917 Bus 13,01432,211 105,522 14,508 10279 Bus 14,571 37,755 134,621 19,285 26829 Bus38,238 99,225 351,200 50,092

Operation Count (Opcounts) and Fill-in Study

The number of floating-point subtractions (FSUB), multiplications(FMUL), divisions (FDIV), and the growth of the matrices as a result offill-in were collected for the Jacobians and are summarized in Tables 2and 3, This information was used to project performance and estimatestorage requirements for the resulting L and U factors. The MATLABfunctions COLAMD and SYMAMD were used to determine elimination ordering.The number of fill-ins is equal to the number of non-zeros in the L andU factors compared to the Jacobian. SYMAMD utilizes the symmetricstructure of the non-zeros in the Jacobian and provides a substantialreduction in fill-in and arithmetic operations.

TABLE 2 COLAMD NNZ and MFLOP Count System NNZ LU # FDIV # FMUL # FSUB1648 Bus 82,378 0.039 1.088 1.027 7917 Bus 468,757 0.225 9.405 9.04110279 Bus 450,941 0.206 6.267 5.951 26829 Bus 1,424,023 0.661 40.11039.037

TABLE 3 SYMAMD NNZ and MFLOP Count System NNZ LU # FDIV # FMUL # FSUB1648 Bus 45,595 0.021 0.268 0.243 7917 Bus 236,705 0.110 1.737 1.60610279 Bus 293,688 0.130 1.976 1.817 26829 Bus 868,883 0.392 11.22310.705

Given a floating point throughput limited calculation, the performanceis proportional to the number of floating point operations (FLOPs)divided by the throughput of the floating-point hardware. Based on theflop counts and a throughput of 200 million floating point operationsper second (MFLOPs), the projected solve time for the largest system isapproximately 0.11 seconds, solved using the SYMAMD pre-permutation.

The storage requirement for the L and U factors is proportional to thenumber of non-zero elements for a non-restoring design, where theoriginal inputs are overwritten by the resulting solution. A sparsematrix storage scheme is much more space efficient than dense storagewhen dealing with matrices of large size, but very few non-zero values.Row compressed storage is a common method used by sparse solvers [7].This scheme stores the non-zero values of a sparse matrix by a set ofrow vectors, one vector containing non-zero values and the othercontaining the corresponding column indices. Assuming 32-bit indices and32-bit values for the matrix entries, the amount of storage required tostore the L and U factors is approximately 7 MB for the largest testsystem, solved using the SYMAMD pre-permutation. This amount of storageis well within the space available on off the shelf memory chips(www.micron.com).

Elimination Details

The pattern of non-zero elements that arise during the eliminationprocess was analyzed in order to estimate the available parallelism,intermediate storage requirements, and potential for data reuse.

Table 4 shows the average number of non-zero elements in the rows andcolumns of the matrix as the elimination process proceeds. It is used asan estimate for the amount of row or column parallelism (separate rowscan be updated in parallel) that can be achieved by utilizing multipleFPUs. Observe that the reduced fill-in obtained by SYMAMD leads toreduced parallelism. The amount of resources on an FPGA can support thenumber of floating-point units required to achieve maximal parallelism.

FIG. 1 shows the distribution of the number of non-zero entries in therows that occur during the LU factorization for the 1648 Bus system. Thex-axis is the number of non-zeros and the vertical bars indicate thenumber of rows with that many non-zero entries. The bulk of theelimination rows are at or below the average. This histogram isrepresentative of all of the row and column results for both COLAMD andSYMAMD across all four benchmark systems.

The maximum number of non-zero elements in a row determines the size ofbuffers that store rows. Table 5 shows the maximum NNZ in a row for theentire row as well as the portion of the row used during elimination(Table 5). Using 32-bit indices and 32-bit floating point values, 19Kbits of storage is required for the largest elimination row of the 26829Bus system. Today's FPGAs contain thousands of Kbits of high speedembedded memory, which is enough to buffer hundreds of rows or columns.

TABLE 4 Mean elimination row and column size COLAMD SYMAMD System RowCol Row Col 1648 Bus 13.4 14.2 7.2 8.1 7917 Bus 15.8 16.5 7.8 8.6 10279Bus 11.7 11.7 7.5 7.80 26829 Bus 14.2 14.2 8.5 8.9

TABLE 5 Maximum elimination row and matrix row size COLAMD SYMAMD SystemElim. Matrix Elim. Matrix 1648 Bus 90 231 66 218 7917 Bus 147 477 125275 10279 Bus 116 416 186 519 26829 Bus 258 886 293 865

The amount of reuse between subsequent elimination rows impacts thepotential efficiency of a memory hierarchy. Rows that are reused fromone iteration to the next can be cached and a row not used in theprevious iteration can be prefetched. Rows reused in subsequenteliminations do not allow pivot search to occur in parallel withcomputation during elimination since some values being calculated arenecessary for the next pivot search. Table 6 shows that slightly morethan half of the rows are reused between subsequent iterations.

TABLE 6 Mean Percentage of Non-Zero Row Elements Reused betweenSuccessive Elimination Rows System COLAMD SYMAMD 1648 Bus 64% 53% 7917Bus 64% 54% 10279 Bus 60% 54% 26829 Bus 60% 54%

Software Performance

The comparison system used in the benchmarks is a 2.60 GHz Pentium IV®computer running Mandrake Linux 9.2™. Software was compiled using gccv3.3.1 (flags-O3-fpIC) and utilizing ATLAS CBLAS libraries. Threehigh-end sparse solvers were tested, UMFPACK (Un-Symmetric Multi-FrontalPackage) [8], SuperLU (http://www.cs.berkeley.edu/˜demmel/SuperLU.html),and WSMP (Watson Sparse Matrix Package:http://www-users.cs.umn.edu/˜agupta/wsmp.html). Of the three solvers,UMFPACK gave the best results on all four systems. The numerical solvetime, floating point operations per second, and floating pointefficiency of the comparison system were gathered using the power systemmatrices. Despite the 2.6 GHz clock speed and impressive computationalpower of the Pentium IV® benchmark system, even the highest performancesoftware was only able to utilize a fraction of the total availablefloating-point resources.

TABLE 7 Comparison System Benchmarks Mflops re- Efficiency % DataUMFPACK Numeric ported by (obtained flop * Name Factor time (s) UMFPACK100/peak flop) 1648Bus 0.03 27.42 1.05 7917Bus 0.13 34.69 1.33 10278Bus0.16 24.9 0.96 26829Bus 0.49 89.74 3.45

Hardware Design

The proposed hardware implements the Gaussian elimination with partialpivoting LU factorization algorithm shown below:

For each column in the matrix loop   Search column for pivot element  Pivot if necessary   Normalize pivot column   Update remaining portionof matrix:     Multiply to form sub-matrix update     Update value orcreate fill-in End loop

In order to maximize performance, the design of the LU hardware focusedon maintaining regular computation and memory access patterns that areparallelized wherever possible. FIG. 2 shows a block diagram of theoverall design of one preferred embodiment according to the currentinvention. The Pivot Search Logic, Update Pivot Column/Row Read Logic,and Update Unit Logic blocks each perform a portion of the overall LUalgorithm, with control and memory access handled by the other twoblocks on the FPGA. The memory hierarchy to which the processing logicconnects, consists of a write-back row-cache implemented in a fastrandom-access memory such as SRAM, as well as a larger block of SDRAMwhich is used to store the matrix rows which do not fit into the rowcache. Our design makes full use of Dual-Ported SRAM memories whichallow memory reads and writes to occur simultaneously. An additionalbank of random-access memory is also required to store the colmap. Thecolmap is a redundant column-wise array of the indices of non-zeroelements in the matrix. The purpose of the colmap is to increase theefficiency of the pivot search.

The minimum size of the row cache is equal to the amount of memoryrequired to store the largest sub-matrix update, which is proportionalto the product of the maximum number of non-zero elements in anelimination row and the maximum number of non-zero elements in anelimination column. A cache of this size insures that all row writes fora given sub-matrix update will be cache hits and will occur at cachespeed, since all row misses will only occur during pivot search insuringrow availability during sub-matrix update. The parameters to calculatethe amount of cache required was extracted as previously described underElimination Details.

For the initial design, the cache row size, SDRAM row size, and colmaprow size exceed the maximum counts, which were extracted during theinitial matrix profiling. This insures that the memories will haveenough space to store the worst case, however, this also means thatthere will be inefficiencies in the storage for the typical case.

Not shown in the hardware diagrams are the facilities required tomaintain row pivoting information, which is implemented by lookup tablesand memory pointers, which keep track of the row to memory mappings forthe compressed row storage format. An additional table is used to keeptrack of elimination progress to avoid reading row values that are notpart of the sub-matrix update, i.e., elements that have already beeneliminated. The size of these lookup tables scales linearly with matrixsize. Depending on the size of the matrices to be solved and theavailable on-chip resources, these tables can be stored in embeddedmemory on-chip or if they are larger, in off-chip memory.

Pivot Search Logic

Now referring to FIG. 3, the pivot search logic block performs the pivotsearch prior to each sub-matrix elimination step. A memory read of thecolumn map followed by indexing logic creates the indices for the pivotcolumn. The column map stores the physical address of rows in aparticular column. The physical address must then be translated to thecorresponding matrix row number. Since the colmap may contain rows whichhave already been eliminated, index rejection logic is required toselect the appropriate indices. The physical addresses corresponding tothese indices are then used to access the values from memory. The valuesare compared sequentially as they arrive to obtain the absolute maximumvalue and index. This value is stored in a register as the pivotelement. The physical indices, virtual indices, and floating-pointvalues for the pivot column are also stored in registers to be used bythe pivot column update logic. The minimum amount of memory required forthese registers is equal to the product of the largest eliminationcolumn in the matrix and the amount of storage for two indices and onevalue. After pivoting is complete, a background update is sent to lookuptables (not shown) to maintain the correct row and address mappings forfuture translations.

Update Pivot Column/Row Read Logic

Now referring to FIG. 4, the update pivot column/row read logic blockperforms normalization prior to elimination in parallel with the pivotrow requested from memory. The pivot index, pivot value, and pivotcolumn are from the pivot search logic (FIG. 3). After the pivot row iscompletely read from memory into buffers, and at least one of the updatecolumn entries has been created the next part of the elimination logiccan proceed. Since the design utilizes a floating-point divider with apipeline rate of 1 per clock cycle, all of the remaining divides willcomplete before they are requested by the update unit logic. The updaterow indices and values, and the normalized update column indices(physical and virtual) and values are stored in registers. The minimumsize of this register space is equal to the product of the maximumelimination row in the matrix and the amount of space required to storethree indices and two values.

Update Unit Logic

Now referring to FIG. 5, the update unit logic performs the remainingcomputations required during the elimination step. In this design, theprocessing logic can support multiple update units operating inparallel. Each update unit is tasked with completing the elimination fora single row of the sub-matrix to be updated. Each update unit containsa floating-point multiplier and adder to perform the required arithmeticas well as other logic that operates on the column indices. Thecomputational logic operates in parallel with the memory and indexinglogic to maximize utilization of all of the logic units. The number ofupdate units that can fit on the FPGA is limited by the amount ofon-chip embedded memory available, the number of multipliers available,and also the available logic resources. A high end FPGA has enoughresources to support over 16 of these units.

The merge logic maintains the row order and also determines if themultiplier output is a fill-in, an update, and also determines if thesub-matrix row element should be copied into the new row. If an entry isa fill-in, the colmap table is updated in the background to reflect thenewly added index. Updates that result in a zero value are not prunedfrom the colmap and they are also stored in the row result. All logic inthe update unit is pipelined for maximum throughput. Once the front endof the update unit has completed processing, it can request the nextavailable row to be operated on in order to eliminate idle logic cycles.After all of the pending row eliminations for the current sub-matrixupdate have been completed, the next pivot search can begin.

Hardware Performance Model

A program was written in C, to project the performance of the hardwaredesign. This program performs the elimination in the same fashion as thehardware would, and operates on an input file that contains the powersystem Jacobian matrix. The number of clock cycles the hardware requiresto obtain the L and U factors is accumulated based on the latency andthroughput of the processing logic and memory hierarchy. This cyclecount allows us to compute the computation time based on clock speed.

TABLE 8 Solve time for LU Hardware Update Units Solve Time (s) System 12 4 8 16  1648 Bus 0.0041 0.0028 0.0021 0.0018 0.0017  7917 Bus 0.02430.0157 0.0116 0.0097 0.0090 10279 Bus 0.0311 0.0202 0.0149 0.0126 0.011726829 Bus 0.1378 0.0817 0.0543 0.0415 0.0359

The memory model used for accessing SDRAM utilized three parameters tomodel the average behavior of a SDRAM memory: verall latency, burst readlength, and latency between bursts. For the SRAM and embedded memoriesthe model parameter was latency from request to data output. The onlyparameter for the floating-point hardware is the pipeline length, whichaffects the latency of results. All units were assumed to have apipeline rate of 1 per clock cycle, even the floating-point divisionhardware. Extending the model to multiple instances of update unit logicwas done by simulating round-robin service for each update unit by thememory hierarchy assuming a proportional increase in bandwidth. Theclock rate for memory and logic was assumed to be synchronous at 200MHz. This system clock rate was chosen because this was the maximumclock rate reported by VHDL synthesis for our floating-point units.

Table 8 shows the projected performance of the proposed LU hardwarearchitecture based on the four test power systems using the SYMAMDpre-permutation at a clock rate of 200 MHz. The simulated hardwareperformance achieves approximately 80% of the performance predicted byopcount. The projected performance using COLAMD (not shown) is less thanSYMAMD, regardless of the number of update units used. FIG. 6 shows therelative speedup of the LU hardware by increasing the number of updateunits. Scaling to even four update units does not net an equivalentspeedup in overall solve time. The reason for this is apparent in FIGS.7 and 8.

FIGS. 7 and 8 show the relative time spent in the three portions of theLU hardware solution. For the single update unit configuration (FIG. 7),the majority of the time is spent performing the matrix update. Roughlya quarter of the time is spent searching for the pivot value. A FIG. 8show that as the number of update units increases, the amount of timespent during pivot searching begins to dominate the overall solve time.

FIG. 9 shows the predicted performance ratio of performance of the LUhardware at 200 MHz to the Pentium IV® benchmark system. This datapredicts a speedup over the Pentium IV® of a factor of 10 with only 4update units and approximately 15 with 16 units.

These results show that special-purpose hardware implemented usingexisting FPGA provides a 10× or higher performance increase, compared tosoftware solutions using state-of-the-art sparse linear solvers, for theLU decompositions that arise in typical power flow computations. Thespeedup is obtained by utilizing special-purpose hardware to reduce theoverhead to access and maintain data structures for sparse matrices andthe use of fine-grained parallelism which performs multiple row updatesin parallel. The reduction in overhead dramatically improves theutilization of the floating-point units as compared to softwaresolutions on general-purpose processors.

A limiting factor is the time for pivot search. Eliminating or reducingthe pivot search time will result in additional gains in performance.Such improvement may be obtained by overlapping the next pivot searchwith the current update, however, this is limited by the amount of reusein rows from one elimination step to the next. The use of otherapproaches such as the Bordered Diagonal Block (BDB) technique mayprovide additional parallelism.

Analog Power Flow Method

A DC emulation power flow method has been proposed in [102] and isreviewed here for an understanding of the application of the line modelin this invention. This approach utilizes multiple DC networks as shownin FIG. 10. The power system is broken up into three main components:generators, transmission lines, and loads. The generators are shown asvoltage sources. This form of DC emulation is possible due to the factthat the emulation is executed in rectangular coordinates. Thegenerators excite the network with real (Re{E_(g)}) and imaginary(Im{E_(g)}) voltage components and the states (voltages and currents) ofthese networks provide the steady-state power flow solution. Thetransmission line components in this emulation scheme are purelyresistive and the imaginary resistor values are dependant upon operatingfrequency. By modeling as fixed resistors the frequency is assumed to beconstant.

The current flowing out of a given generator i is calculated using thefollowing equation [103]:

$\begin{matrix}{\begin{matrix}{I_{Gi} = {\sum\limits_{j = 1}^{n}{{Re}{\left\{ Y_{ij} \right\} \cdot {Re}}\left\{ E_{j} \right\}}}} & {{network}\mspace{14mu} 1} \\{{- {\sum\limits_{j = 1}^{n}{{Im}{\left\{ Y_{ij} \right\} \cdot {Im}}\left\{ E_{j} \right\}}}}\;} & {{network}\mspace{14mu} 2} \\{{+ j}{\sum\limits_{j = 1}^{n}{{Im}{\left\{ Y_{ij} \right\} \cdot {Re}}\left\{ E_{j} \right\}}}} & {{network}\mspace{14mu} 3} \\{{+ j}{\sum\limits_{j = 1}^{n}{{Re}{\left\{ Y_{ij} \right\} \cdot {Im}}\left\{ E_{j} \right\}}}} & {{network}\mspace{14mu} 4}\end{matrix}\mspace{14mu}} & (2)\end{matrix}$

This approach accurately models and calculates power flow for a lossytransmission network with the assumption that there is no frequencydeviation from nominal. A lossy transmission line model is emulatedcomposed of both reactive and resistive elements as shown in FIG. 11.

Relating the emulation circuit to the real world power system determinesthe real and imaginary resistor values in the DC networks. The followinggoverns this [104]:

$\begin{matrix}{R_{{Re}{({ij})}} = {\frac{1}{{Re}\left\{ Y_{ij} \right\}} = \frac{R_{ij}^{2} + X_{Lij}^{2}}{R_{ij}}}} & (3) \\{R_{{Im}{({ij})}} = {\frac{1}{{Im}\left\{ Y_{ij} \right\}} = \frac{R_{ij}^{2} + X_{Lij}^{2}}{X_{Lij}}}} & (4)\end{matrix}$

where R_(ij) and X_(ij) are the resistance and reactance of a linebetween buses i and j. The same approach is used for losslesstransmission lines. The transmission line resistances are neglected asshown in FIG. 12. The topology is similar to FIG. 10 with the exclusionof networks 1 and 4. Similarly, it is solved by omitting the excludednetworks.

The general layout for the DC emulation approach has been provided alongwith a relationship between AC lumped line models and the DC networks.With this information, an operational transconductance amplifier (OTA)based transmission line model with reconfigurability was developedexplicitly for the DC emulation power flow method.

Transmission Line Model

The transmission lines were modeled as resistors in the DC emulationshown previously for both lossy and lossless lines. Actual use ofresistors would require manual intervention to configure and alter theanalog system. In order to overcome this problem, a remotelyreconfigurable OTA based variable resistor was developed to model thelines.

The transconductance gain (g_(m)) of an OTA is controllable through anexternal source over a wide range, allowing remote reconfigurability. Inaddition, the basic OTA only consists of a few current mirrors resultingin a small and relatively low-cost device. They are also readily used inVLSI design. Due to its versatility, the OTA is one of the mostimportant building blocks of analog VLSI circuits [105] and can behighly accurate if used properly. Some of the pitfalls of OTAs are theirsensitivity to temperature and saturation effects but there are methodsto correct these [106, 107]. The principle operation of an OTA is avoltage controlled current source (VCCS). FIG. 13 shows an ideal OTA.

The amplifier has a differential voltage input and a current output. Thecurrent output is related to the differential input through atransconductance gain (g_(m)) controllable through an external source.Ideally the output would be:

i_(out)=g_(m)v_(in)  (5)

where v_(in) is the differential input voltage.

With VCCS operation, the OTA naturally lends itself to an application ofa variable resistor. The proposed OTA based variable resistor is shownin FIG. 14. It consists of two OTAs to address bi-directional currentflow. When current is flowing from left to right, the OTA on the rightis supplying the current, conversely if current is flowing in theopposite direction. The inputs are floating with respect to the powersupply and feedback is incorporated to extend the linear operating rangeof the device.

FIG. 14 illustrates a double-ended OTA transmission line model [108].This line model has three terminals similar to a variable resistor. Thetwo inputs (V₁, V₂) mimic terminals of a resistor and the i_(abc) is thebiasing current used to control the resistance of the model similar tothe wiper on a potentiometer. The terminal relationship to apotentiometer is shown in FIG. 15.

The effective resistance, Reff, of the OTA model is defined as theresistance seen between terminals V₁ and V₂. Reff is governed directlyby Ohm's law through the line voltage (V_(line)) and the line current(I_(line)).

$\begin{matrix}{R_{eff} = {\frac{{V_{1} - V_{2}}}{I} = \frac{V_{line}}{I_{line}}}} & (6)\end{matrix}$

This effective resistance is controlled by the bias current andapproximated by the following equation:

$\begin{matrix}{R_{eff} \approx \frac{2 \cdot R}{g_{m} \cdot R_{A}}} & (7)\end{matrix}$

The sizing of resistors R and R_(A) along with the range oftransconductance gain will determine the behavior of the circuit. Withappropriate resistors and wide control over gain a very large range ofeffective resistance is obtainable. The limitations of the circuit arerelated to properties of OTAs. The limitations, operation andcontrollability were analyzed through PSPice simulations and hardwaretesting.

Simulation

A LM13700 dual OTA was used for the simulations in PSPice. Varioussimulations were run to validate and characterize the line model.Controllability, resistance variance and circuit limitations wereanalyzed. The first simulation dealt with controlling the circuitthrough the bias current. Applying a current source is not as simple andeasily controllable in comparison to a voltage source. The simulationusing the circuit in FIG. 16 uses a voltage source (V_(abc)) behindimpedance (R_(abc)) to drive the bias current.

A voltage sweep was performed on V_(abc) and the corresponding biascurrent produced was measured. The relationship between bias voltage andi_(abc) is differentiable until i_(abc) approached zero as shown in FIG.17.

From these results, the following relationship between V_(abc), R_(abc)and i_(abc) was formed:

$\begin{matrix}{i_{abc} = {\frac{1}{2} \cdot \left( \frac{V_{abc} + 13.56}{R_{abc}} \right)}} & (8)\end{matrix}$

This approximation is close to the simulation results deviating only asthe bias current becomes very close to zero. This is due to a non-linearroll-off of the bias current when compared to the bias voltage applied.Reasonable accuracy is obtained from a ±10 Volt bias voltage.

The transconductance gain of a basic bipolar OTA is directlyproportional to the bias current and is quantified by [109]:

$\begin{matrix}{g_{m} = {{\frac{i_{abc}}{2 \cdot V_{T}} \cdot \sec}\mspace{14mu} {h^{2}\left( \frac{V_{i\; n}}{2 \cdot V_{T}} \right)}}} & (9)\end{matrix}$

The gain in (9) is not linear, typical applications linearize g_(m)around an operating point. The differential input of the OTA is limitedwithin very small range to ensure accuracy of this linear approximation.The LM13700 OTA gain is also a hyperbolic function although it improvesthe linearity through the use of linearizing diodes. These diodes allowa larger swing of input voltage while maintaining linear behavior. Thegain for this chip, factoring in the diodes, is approximated by (10) fora differential input voltage in the range of ±50 mV. This gain candirectly control the effective resistance in (7).

$\begin{matrix}{g_{m} \approx {{\frac{i_{abc}}{2 \cdot V_{T}} \cdot \sec}\mspace{14mu} {h^{2}\left( \frac{V_{i\; n} \cdot D}{2 \cdot V_{T}} \right)}}} & (10)\end{matrix}$

where V_(T)=26 mV and D is a diode linearization constant.

A simulation was conducted where V_(abc) was varied to change theeffective resistance while holding the line voltage constant. FIG. 18shows simulation and theoretical results from (7). The effectiveresistance was computed using (6). These results show thecontrollability of resistance over a wide.

More simulations were run to analyze the consistency of the resistance.While maintaining a constant bias current, the circuit was subjected tovarying line voltages. FIG. 19 shows multiple sweeps of the line voltagewith different bias currents. The plot shows line current vs. linevoltage. For an ideal resistor, the slope of the resulting line shouldbe constant.

As seen in FIG. 19, when the line voltage is between ±4 volts, theslope, or resistance, remains constant. Beyond that threshold the OTAbegins to saturate. FIG. 20 shows the same plot zoomed in on the linearrange.

The saturation of the circuit is dependant on two factors: the biascurrent and the throughput current. The effective resistance has avariance of ±2% with throughput current 40% or less of the bias current.Complete saturation occurs when the throughput current is equal to thebias current. Further examination of FIG. 20 reveals an offset in thecircuit behavior. The output of the circuit is nonzero when no voltageis applied to the input (one of the line terminals). This is problematicas the effective resistance of the circuit governed by Ohm's law isnonlinear. The computation of this resistance is shown in FIG. 21 andexhibits anomalous behavior.

The offset current present at the output of the OTA is almost entirelycaused by the internal offset voltage of the OTA. This can be modeled ina similar manner as offset voltages of traditional op-amps.Specifically, a voltage source (v_(off)) at the input of the device isamplified to the output producing the offset (i_(offset)). For an OTA,this is quantified by (11).

$\begin{matrix}{i_{offset} = {{v_{off} \cdot \sec}\mspace{14mu} {h^{2}\left( \frac{v_{i\; n}}{2 \cdot V_{T}} \right)}}} & (11)\end{matrix}$

A method for eliminating an OTA input offset voltage has been proposedin [110]. This method is suitable for fabrication and quantifies andeliminates the offset for any gain configuration. An offset rejection of40 dB was achieved in simulation [110]. With the recognition ofsufficient offset cancellation, the data was corrected in a mannerrepresenting the elimination of the offset quantified by (11) to analyzethe results with offset compensation. The corresponding results are muchbetter. FIG. 22 shows effective resistance versus line voltage forvarious bias current setting corrected for the offset. The asymptoticbehavior is eliminated and the resistance exhibits a variance ofapproximately 1% with a line voltage magnitude of 2.5 volts or less.

Hardware

The OTA based transmission line model was constructed in hardware usingthe LM13700 dual OTA and tested. Tests similar to the PSpice simulationswere conducted. The effective resistance of the circuit was calculatedfrom measurements while varying the bias voltage. The hardware exhibitssimilar controllable resistance as the software simulations with onlyslight deviation from simulation results as shown in FIG. 23. Analysisshows the offset voltage present in hardware is slightly higher, andsubsequently closer to the data sheet specifications, than the PSpicemodel.

Multiple hardware tests were conducted to analyze the saturation effectsof the circuit. The circuit was configured with a constant bias currentand subjected to a varying line voltage. FIG. 24 shows results of thesetests for eight different resistance configurations. The slope of plotsis indicative of the effective resistance and is constant for a limitedline voltage range before saturation sets in similar to the simulationresults. Approximately the same variance of ±2% for line currents at 40%or less of bias current was exhibited in hardware. FIG. 25 shows linecurrent and the linear characteristic of this line current plottedagainst line voltage. The response is very linear and deviates onlyslightly in this range. This linearity is exhibited in the effectiveresistance of the circuit.

The effective resistance is plotted against line voltage in FIG. 26.With the line voltage range between ±2.5 volts, the variance of theeffective resistance was less than 1%. This is an accurate range foroperation of this model. The line model becomes less accurate,saturation effects become more severe, as the line current magnitudeapproaches the bias current magnitude. The linear range for this model,defined by <1% variance, is quantified by the following relationship:

i _(line)≦0.3·i _(abc)  (12)

Within device limitations, the line voltage can be any magnitude as longas the relationship in (12) is maintained. This is advantageous as thelinear operating voltage range has been significantly increased from abasic open loop OTA with the use of feedback. This will allow easymeasurement of the bus voltages in an emulation network.

The results from both simulation and hardware were very similar andverify the operation, control and linear operating conditions of theproposed OTA based transmission line model. A three-bus prototype wasconstructed and tested based on these results.

Three-Bus Prototype

A network module prototype of a three-bus power system was built basedon the proposed transmission line model. This design was incorporated ona single circuit board containing the transmission line components ofthe four networks as shown in FIG. 10. It is suitable for both lossy andlossless models shown in FIGS. 11 and 12, respectively. The main purposeof this prototype was to validate the transmission line modelapplication in the previously proposed DC emulation power flow technique[102]. The configuration of the power system is shown in FIG. 27.

The power system consists of one load, two generators and threetransmission lines (A, B, C). Four independent networks containing threeOTA based variable resistors were used in this emulation. Thecombination of these networks constitutes the complete transmission linemodel. The circuit board contains only the line models with terminals toattach generator and load models. The complete circuit board is shown inFIG. 28. The lines (A, B, C) are highlighted in the picture to show therelationship to the power system along with the labeling of the fournetworks in relation to FIG. 10. The input terminals for generators,loads, and power supply are also shown. For simplistic operation withfewer power supplies, all the bias currents are driven by the +15 voltsupply. Configuration of the lines was accomplished by the respectivevalues of R_(abc). In addition, there are jumpers on the board thatconnect the transmission lines to the buses. Lines can be removed forcontingency analysis or the network can be reconfigured with thesejumpers. In addition, this design is scalable and subsequently largerpower systems can be emulated with interconnection of multiple boards.

The network module was configured for power flow emulation. DC voltagesources were used as generators (values were obtained from a computedsteady-state power flow solution) and the load was an OTA based constantcurrent source. This is shown in FIG. 29. While this setup requires theprior knowledge of the generator bus voltages, the appropriate voltageon the load bus corresponding to the correct load currents is sufficientto verify the functionality of the network module. The results werecompared to results from the PowerWorld® software package [111].

The network module was configured per the lossless system in PowerWorld®shown in FIG. 30. Table 9 shows the results, load bus voltage in perunit, from both the hardware (offset corrected) and the softwarecomputation.

TABLE 9 Power World and HW Emulation Results Load Bus VoltageRectangular Polar PowerWorld 0.78 + i0.25 0.816

17.48° HW Emulation 0.78 + i0.24 0.817

17.35°

indicates data missing or illegible when filed

The results in this case are very accurate. From multiple runs anddifferent load configurations the voltage magnitude error betweenPowerWorld® and HW was approximately 1.5% and the phase angle error wasabout 5%. The errors are caused mostly by deviations from OTA to OTA andresistor tolerances.

The transmission line model presented is sufficient for lossy andlossless line models for use in the DC emulation static power flowtechnique mentioned. The overall model consists of multiple OTA basedvariable resistor circuits. The shortcomings of the hardware prototype,such as linear range, accuracy and device offsets, are mostly due to thecurrent OTA technology. This can be overcome with further development ofbetter performing OTAs, which have been shown in prior research, in asimilar manner in which traditional op-amps have been developed. Thisproblem can also be tackled through a custom VLSI design.

An accurate, low-cost and remotely reconfigurable OTA based analogtransmission line model has been developed. Simulation and hardwareresults verified the design, clearly defined linear operating ranges andexamined the deficiencies of current off-the-shelf OTAs while pointingout methods of compensation. A three-bus power system was constructedbased on the proposed model and the resultant power flow resultscompared favorably to those from commercially available software. Thisexpands on the work presented in [101-103] by the introduction of ahardware based line model with fast, remote reconfigurability suitablefor the presented power flow emulation scheme.

In another preferred embodiment, simulation and/or emulation are usedwhen dealing with large-scale power systems. They make it possible to doessential assessment in power system dispatch, operation, security andstability. Different simulation/emulation tools today are built fordifferent applications like transient stability analysis, faultanalysis, power flow studies, operational planning, etc.

ABMs are used to make flexible descriptions of electronic components orcomplex building block in terms of transfer function or lookup tables.In order words, a mathematical relationship is used to model a circuitsegment so one will not need to design the segment component bycomponent. FIG. 31 shows a sine shaper which is a device that outputssine value of its input signal. The left side of FIG. 31 is a modifiedBarry Gilbert sine shaper [203] built component by component usinganalog devices, and the right side of FIG. 31 is an ABM sine shaper[201]. The only similarity between the two is that they both take thesame input signal and produce the same output.

One aspect of the invention employs transient stability analysis withthe advantages of shorter computation time than current digitalsimulations and smaller size and cost than discrete analog emulators.One of the main disadvantages of this method is the limited accuracy dueto the implementation of the analog VLSI chip.

Methodology of Core Computations Using ABMs

Analog Behavioral Models (ABMs) of PSpice are used to substitute andimplement mathematical equations and/or scaled relationship analogcircuit representations, which describes the states of power system andemulate its behaviors. This gives a clearer picture of the end goal,design layouts and intricacies of circuit connections.

Power system core computation methodology implemented in the design ofanalog emulation goes through three major steps. These steps areillustrated in FIG. 32. The assumptions made and detailed processes arediscussed in [211] and [202], respectively.

Step 1: Calculating Complex Voltage Out of Generator

The generator block solves the swing equation (dynamic) with resultingpower angle, δ. The power angle combines with desired voltage magnitudeto give a complex voltage out of a generator.

Step II: Calculating Currents in the Network

The nodal and generator induced voltages interact with transmission lineimpedance and load (if not lumped with line impedance) to producecomplex current flowing in any branch. Using admittance for simplicityand expressing in rectangular form, current is computed as in FIG. 32.As depicted in FIG. 33, it will require four separate networks in orderto emulate complex current flowing on any branch. The approach ofDC-resistive network as proposed by Fried et al. [211] is used.Transformation of the transmission line complex impedance to itsresistive value for real and imaginary part is through its complexconjugate as shown in equation 13:

$\begin{matrix}\left. \begin{matrix}\begin{matrix}{Y = {\frac{1}{Z} = {\frac{1}{R + {j\; X}} = \frac{R - {jX}}{R^{2} + X^{2}}}}} \\{Y_{Re} = {\frac{R}{R^{2} + X^{2}} = Y_{r}}}\end{matrix} \\{Y_{Im} = {{( - )\frac{X}{R^{2} + X^{2}}} = Y_{i}}}\end{matrix} \right\} & (13)\end{matrix}$

Combining equation 13 with FIG. 33, implementation of complex currentcomputation as described in FIG. 34 is shown. This means that the realpart of complex current flowing in any branch is given by the sum ofcomponent currents in networks 1 and 2, and the imaginary part as thedifference between component currents in networks 4 and 3, as describedin equation 14.

$\begin{matrix}\left. \begin{matrix}\begin{matrix}{I = {I_{r} + {j\; I_{i}}}} \\{I_{r} = {I^{I} + I^{II}}}\end{matrix} \\{I_{i} + I^{IV} - I^{III}}\end{matrix} \right\} & (14)\end{matrix}$

where subscripts r and i represent real and imaginary part,respectively. I, II, III, IV represent the four DC networks.

Step III: Computation of Generator Real Power and Swing Equation Update

The real power out of each generator in the system is calculated as thereal part of the product of generator terminal voltage and complexconjugate of the generator current:

  (15)

The swing equation describes the dynamics of a generator. Its input is amechanical power from a prime mover and which is assumed constant. Theoutput of a generator is an electrical power, which changes inaccordance with the state of the network. The swing equation is a secondorder differential equation and its solution is the power angle of thegenerator. The power angle combines with generator internal voltage toconstantly update the generator terminal voltage and hence the generatorcurrent. FIG. 35 describes the continuous cycle of power system corecomputation.

FIG. 36 describes the DC-network equivalent of a real power system busand conditions for its implementation. Analyzing a sample power systemand considering an arbitrary bus k with its complex voltage values,there are 4 networks with 4 buses K^(I), K^(II), K^(III), and K^(IV) asequivalents. For feasible bus voltages, the following conditions must besatisfied:

$\begin{matrix}\left. \begin{matrix}{V_{k}^{I} = {V_{k}^{III} = V_{k_{r}}}} \\{V_{k}^{II} = {V_{k}^{IV} = V_{k_{i}}}}\end{matrix} \right\} & (16)\end{matrix}$

The feasible bus voltages condition is realizable with a constantPQ-load model [202]. Among other attributes of an analog emulator thatcontribute greatly to its versatility are the scale factors. Scalefactors are constants that relate the values of the variables on theemulator to the values of the variables in the real system under study.

Case Studies and Results

Cases studied, which include 3-bus, 6-bus and 14-bus lossy and losslesspower systems, were implemented using ABMs of PSpice. ABMs substitutereal analog circuit implementation of the mathematical equations andscaled relationship that describe the states of the power system casesand emulate their behaviors. Power flow tests ere conducted on each ofthe cases. Similar cases were tested on industrial grade numericalsimulation software, PowerWorld® v9.1 [212] and Power System Simulationfor Engineering® (PSS/E) v.28.1 [213], for benchmarking. For thevalidation process, the following parameters were measured and/orcalculated and compared with the benchmarks.

i. δj—Voltage angle difference between generator buses j and 1 (j=2, 3,. . . );

ii. Load bus complex voltage;

iii. Complex current flowing in each branch.

Simulation results obtained from the analog emulation enginesimplemented on ABMs of PSpice and that of the benchmarks are presentedin Tables 10 through 13. The results compare favorably. This confirmsthe validity of the methodology as well as the technique ofimplementation.

TABLE 10 Summary Results for 3-Bus Power System (Lossless System)PowerWorld PSS/E Analog Emulator Relative Power Angle 0.94° 0.9° 0.94°(δ₁₁) in degree Load Bus (V3) Voltage 0.988

−3.17° 0.988

−3.2° 0.9880

−3.12°  (V_(in)) Current in Branch 2_1 343.1848 343 343.2

0.47°  (I, A) Current in Branch 1_3 784.49 784 784.5

−13.91° (I, A) Current in Branch 2_3 503.87 504 503.9

−10.70° (I, A)

indicates data missing or illegible when filed

TABLE 11 Summary Results for 3-Bus Power System (Lossy System)PowerWorld PSS/E Analog Emulator Relative Power Angle 0.65° 0.6° 0.65°(δ₂₁) in degree Load Bus (V3) Voltage 0.9675_(in)

−1.03° 0.968_(in)

−1° 0.9675_(in)

−1.03°  (V_(in)) Current in Branch 2_1 334.2612 334 334.6

45.30° (I, A) Current in Branch 1_3 691.3922 691 691.6

1.36°  (I, A) Current in Branch 2_3 654.2371 654 654.2

−23.1° (I, A)

indicates data missing or illegible when filed

TABLE 12 Summary Results for 6-Bus Power System (Lossy System)PowerWorld PSS/E Analog Emulator Relative Power Angle −2.10° −2.1°−2.10° (δ₂₁) in degree Relative Power Angle −2.71° −2.7° −2.71° (δ₃₁) indegree Load Bus (V4) Voltage 0.974

−7.35° 0.974

−7.3° 0.974

−7.35° (V_(in)) Load Bus (V5) Voltage 0.988

−4.69° 0.988

−4.7° 0.988

−4.69° (V_(in)) Load Bus (V6) Voltage 0.983

−6.36° 0.983

−6.4° 0.9829

−6.36°  (V_(in)) Current in Branch 1_2 344.86 345 344.86

−29.76° (I, A) Current in Branch 1_5 758.72 759 758.72

−20.98° (I, A) Current in Branch 2_5 388.84 389 388.84

−15.95° (I, A) Current in Branch 2_6 515.047 515 515.04

−11.7°  (I, A) Current in Branch 3_6 376.53 377 376.54

−32.34° (I, A)

indicates data missing or illegible when filed

TABLE 13 Summary Results for 14-Bus Power System (Lossless) PowerWorldPSS/E Analog Emulator Relative Power Angle −1.08° −1.1° −1.08° (δ₂₁) indegree Relative Power Angle −0.75° −0.8° −0.75° (δ₃₁) in degree RelativePower Angle −8.09° −8.1° −8.09° (δ₄₁) in degree Relative Power Angle−7.30° −7.3° −7.30° (δ₅₁) in degree Load Bus (V4) Voltage 0.989

−5.51° 0.989

−5.5°  0.989

−5.51°  (V_(in)) Load Bus (V5) Voltage 0.991

−4.75° 0.991

−4.8°  0.991

−4.75°  (V_(in)) Load Bus (V7) Voltage 0.984

−9.22° 0.984

−9.2°  0.984

−9.22°  (V_(in)) Load Bus (V9) Voltage  0.979

−11.15° 0.979

−11.2° 0.979

−11.15° (V_(in)) Load Bus (V10) Voltage  0.980

−11.68° 0.980

−11.7° 0.980

−11.68° (V_(in)) Load Bus (V11) Voltage 0.987

−10.7° 0.987

−10.7° 0.987

−10.70° (V_(in)) Load Bus (V12) Voltage  0.982

−12.42° 0.982

−12.4° 0.982

−12.88° (V_(in)) Load Bus (V13) Voltage  0.980

−12.88° 0.980

−12.9° 0.980

−12.88° (V_(in)) Load Bus (V14) Voltage 0.979

−11.2° 0.979

−11.2° 0.979

−11.20° (V_(in)) Current in Branch 1_5 311.63 312 311.63

8.52°   (I, A) Current in Branch 2_5 309.85 310 309.85

10.86°  (I, A) Current in Branch 5_4 262.50 263 252.50

12.84°  (I, A) Current in Branch 4_9 151.21 151 151.21

14.42°  (I, A)

indicates data missing or illegible when filed

Simulation outputs of analog emulator were measured in rectangular formand/or radians. Conversions to polar representation and/or degree weremade for easy comparisons with the benchmarks. Also, for convenience andefficient analog emulation implementation, certain power systemvariables and parameters were represented by other circuit variablesFIG. 37 summarizes variable representations used in this example. Sinceconversion ratio used between current-voltage or power-current is 1:1,obtained values remain representative in quantity.

Generator Model

The proposed circuit is based largely upon the classical generatormodel. This section presents a short review of the basics behind thegenerator model developed and scaling of the model to an appropriateanalog circuit. The classical generator model that will be used for thisexample consists of a voltage source behind an impedance, where V∠0represents the generator terminal voltage, E∠δ denotes the internalgenerator voltage and angle, and Z representing the transient reactanceof the generator ′dX. The generator consisting of these elements can beseen in FIG. 39.

With this model, the electrical angle, δ, is unknown, and therefore,needs to be computed. In order to determine the electrical angle, thedynamics of the generator were captured. The simplified mechanicalequation that governs the motion of the generator rotor, describes theswing between the mechanical and electrical angle, otherwise known asthe swing equation,

M

{dot over ({dot over (δ)})}+D{dot over (δ)}+P _(e)(δ)=P _(m)  (17)

where M is the generator inertia coefficient, D is the dampingcoefficient, P_(e) is the electrical power output, and P_(m) is themechanical input power. The solution we seek, the load flow, correspondsto the steady state solution of this equation.

In order to simplify the calculation required in the analog circuit, theelectrical power output of the generator will be solved in rectangularform. For converting the polar notation to rectangular form, only thereal and imaginary components will be necessary and the calculation forthe electrical output power of the generator model can be rewritten asthe following equation:

Re{S}=Re{V·I*}

P _(e) =E _(Real) ·I _(Real) +E _(Imag) ·I _(Imag)

P _(e) =|E|·cos δ·Re{I}+|E|·sin δ·IM{I}  (18)

The representation of the electrical output power will need to beapplied directly into the swing equation for the calculation of theelectrical angle. The functionality of the swing equation as seen in theblock diagram in FIG. 40, will provide us with a layout for thegenerator model that will be presented in the following section.

With many different models representing the generator available, we willuse the classical model while neglecting damping. This will simplify thenon-linear dynamics of the classical generator without significantlyreducing the precision of the model [305]. With these assumptions, theswing equation (17) will reduce to only the difference between themechanical input power and the electrical output power. After factoringin the generator inertia coefficient and taking two integration steps ofthe angular acceleration, the solution of the electrical power angle canbe calculated in the following form:

$\begin{matrix}{\delta = {\frac{1}{M}{\int{\int{\left( {P_{m} - {P_{e}(\delta)}} \right){t}{t}}}}}} & (19)\end{matrix}$

Scaling Parameters

In order to realize a power system using analog emulation, two differentscaling factors need to transpire. The time scaling parameter is for thecomputation speedup of the emulator while the scaling of parameter spaceis for the electrical angle to be represented as voltage in the circuit.Next is the introduction of the time scale factor, τ, a constant whichwill represent a computational speedup of τ times faster than thereal-time system. This factor is directly proportional to the real timeof the power system, t, for a given simulation time of the analogemulator, T. With this advantage, we can include the time scale factorin the solution of the electrical angle (19), capable of scaling thecomputation time of emulator in the following manner:

$\begin{matrix}{{\tau = \frac{t}{T}}{\delta = {\frac{\tau^{2}}{M}{\int{\int{\left( {P_{m} - {P_{e}(\delta)}} \right){t}{t}}}}}}} & (20)\end{matrix}$

The time scale factor needs to be squared because of the second ordernature involved in the dynamics of the classical generator model. Acollaborating factor that may influence the choice of the time scale isthat the error in the integration may be accentuated by long emulationtimes.

Another scaling procedure is actual parameter scaling of the electricalangle to voltage. Introducing the desired voltage, υ to be representedas a voltage level in the circuit that is equivalent to π radians, (20)can be rewritten as the following equation:

$\begin{matrix}{\delta = {\frac{\tau^{2}}{M}\frac{\upsilon}{\pi}{\int{\int{\left( {P_{m} - {P_{e}(\delta)}} \right){t}{t}}}}}} & (21)\end{matrix}$

With the generator model and scaling parameters defined, we can proceedto develop this model using CMOS analog devices. Before selecting thedevices we need to identify the circuit parameters needed to scale thepower system to an analog circuit and the parameters needed to achievereconfigurability within the model.

Reconfigurable Generator Model

The next step of modeling addresses the concept of reconfigurability inthe generator model. The actual analog circuit representing thegenerator needs to incorporate a programmable analog device into modelto obtain controllable behaviors of the power system.

Instead of the typical operational amplifiers to realize some of thecomponents needed, the operational transconductance amplifier (OTA) willbe used because of its ability to control its gain by an externalcurrent (or voltage).

The OTA has the basic properties of a voltage-controlled current source(VCCS) and the output current is obtained by:

I _(o) =g _(m)(ΔV _(in))  (22)

a product of the transconductance parameter and the differential inputvoltage. The transconductance, gm, can be increased or decreased as afunction of the input voltage by varying the amplifier bias current,I_(abc), according to this equation:

g _(m) =ρ·I _(abc)  (23)

as a gain and is dependent on a non-linear device constant, ρ and theamplifier bias current as seen in (23) [306]. Ideally, the desirablegain of an OTA would always be constant, with a linear input and outputrelationship. Since this is not the case, measures will need to beconsidered because of the errors that will result from the non-lineartransconductance of the OTA.

Circuit Parameter Transformation

To achieve the difference between the mechanical input power and theelectrical output power on a circuit, power is quantified as current.This introduces a circuit scaling parameter, Kp, a constant ratiobetween the circuit current and model power quantity as seen below.

$\begin{matrix}{K_{p} = \frac{I_{\max}}{P_{\max}}} & (24)\end{matrix}$

The analog circuit developed to solve the swing equation of thegenerator will consist of two first-order integrators built using OTAs[306]. The output of the double integration process is voltagerepresenting the electrical angle and is calculated in the followingmanner analogous to (21).

$\begin{matrix}{{\delta \equiv V} = {\frac{g_{m\; 1} \cdot g_{m\; 2}}{C_{1} \cdot C_{2}}{\int{\int{\left( {I_{m} - {I_{e}(V)}} \right){t}{t}}}}}} & (25)\end{matrix}$

In this equation, the I_(e)(V) represents electrical power which isdependent on the electrical angle. The components C₁ and C₂ are thevalues of the integration capacitor needed in the first and secondintegrators, respectively. The transconductance parameters, g_(m1) andg_(m2), are the gains of the first and second OTAs in the doubleintegration process. These values can be reconfigured using theamplifier bias currents to correspond to various generatorcharacteristics and a given range of speedup times.

Analog Generator Model

In order to provide the analog implementation of the algorithm describedin the previous section, an approach to simulate the block diagram ofthe swing equation (19) needs to be realized. A functional block diagramof the generator behavioral model is made up of voltage-controlledcurrent sources (VCCS), current-controlled voltage source (CCVS),amplifiers, integrators, and various other components that determine theelectrical output power of the generator. FIG. 41 represents the blockdiagram of the analog behavioral model of the generator.

Similar to I_(e), the current, I_(m), is used to quantify mechanicalpower. The difference between this current and I_(e) is used as theinput into the double integrator scaled as voltage. The voltage outputof the double integrator is equivalent to the electrical angle given by(21). By definition, electrical power output is dependent on generatorterminal voltage and current. Upon calculation of the electrical power,P_(e)(δ), it will be scaled to a corresponding current value, I_(e)(V),and the process updates itself until a possible stable steady-statesolution is attained.

To achieve this operation on a circuit, appropriate CMOS devices must beselected. Considerations used for selection include minimal device powerconsumption, suitable device operating ranges, and transconductance. Thedevices selected for this model will use commercial off-the-shelf (COTS)CMOS components including the National Semiconductor LM13700 OTA [307]and the Analog Devices AD639 Trigonometric Converter [308]. Having thedevices selected, an example of the analog emulator will be discussed,along with emulation results.

Power System Example

The model in FIG. 41 was used in a simple power system problem [39]. Forthis example, the generator delivered power to an infinite bus through atransmission line and the solution of the electrical angle can be seenas described in equation (19).

The calculation of the steady-state solution of the electrical angle ofthe generator can be found by solving the instantaneous generator powerfor the system,

$\begin{matrix}{{P_{m} = {\frac{{E}{V}}{\left( {X_{d}^{\prime} + X_{L}} \right)}\sin \; \delta^{\circ}}}{P_{m} = {P_{e,\max}\sin \; \delta^{\circ}}}} & (26)\end{matrix}$

where |E| is the internal generator voltage, |V| is the bus voltage,X′_(d) is the transient reactance of the generator, and X_(L) is theline reactance. For a given mechanical input power, the steady-statesolution of the electrical angle can be found with given systemparameters.

The analog circuit for the single-machine power system consisted of fourOTAs, a trigonometric converter, and resistors and capacitors as seen inFIG. 42. The controllable gains were calculated in order to fit thecircuit to the power system, which was dependent on the selection of theresistor and capacitors in the circuit. The selection of the resistorvalues were needed to scale the parameters to proper voltage levels forthe OTAs to function within their linear operating range. The capacitorselection determined the integration time of the circuit.

The gain representing the maximum electrical power output, P_(e,max), istransformed into the maximum current gain parameter A, which isequivalent to I_(e,max). The maximum allowable current parameter will becontrolled through the gain of an OTA, g_(mA). This can be changed fordifferent power system parameters in the following manner from equations(24) and (26).

$\begin{matrix}{{{A = {K_{p} \cdot P_{e,\max}}};{g_{m\; 1} = {A \cdot \frac{R_{1} + R_{2}}{R_{2}}}}}{g_{m\; 1} = \frac{K_{p}{E}{V}\left( {R_{1} + R_{2}} \right)}{\left( {X_{d}^{\prime} + X_{L}} \right)R_{2}}}} & (27)\end{matrix}$

The two gains of the OTAs within the double integrator are consideredidentical to one another. Using equations (21), (24), and (25), thegains can be characterized in terms of the following:

$\begin{matrix}{{M = \frac{H}{\pi \; f_{0}}}{g_{m\; 1} = {g_{m\; 2} = \sqrt{\frac{P_{m} \cdot C_{1} \cdot C_{2} \cdot \tau^{2} \cdot \upsilon}{K_{p} \cdot M \cdot R \cdot \pi}}}}} & (28)\end{matrix}$

For equation (27), the power system parameters are directly associatedwith a controllable gain parameter. In equation (28), the controllablegain is directly associated with the generator inertia constant and thescaling parameters defined in equations (20) and (21). This allows for acompletely reconfigurable classical generator model that providescomputational speedup for conventional digital methods.

The reconfigurable classical generator model has been tested in asingle-machine power system. The results of the simulation shown arefrom the circuit emulator of FIG. 42 using PSpice and will be comparedto simulation results obtained using PSpice Analog Behavior Models(ABM). This comparison is validated from previous work [310] conductedusing Analog Behavior Models of sample power systems, where the resultsobtained compared rather favorably with traditional load flow solverssuch as PSS/E and PowerWorld®.

The power system parameters and the associated circuit parameters basedon the previous equations for the simulation example are listed in Table14. For this example, the mechanical input power is fixed constant at0.5 per unit (p.u.) while the emulator solves for a steady-statesolution.

The results from the power system can be seen in FIG. 43 a and theresults of the analog emulator can be seen in FIG. 43 b. With bothconverging to approximately the same steady-state solution, the analogemulator solved from the electrical angle of the swing equation muchfaster than the actual power system. This clearly demonstrates thecapability of the computational speedup the analog emulator possesseswith respect to real-time.

TABLE 14 System and Circuit Parameters Power System Circuit Emulator |E|1.8 C₁ (ηF) 100 |V| 1.0 C₂ (ηF) 100 X_(d) (p.u.) 1.0 g_(m1) (mho) 0.0182X_(L) (p.u.) 0.4 g_(m2) (mho) 0.0182 H (sec) 5.0 K_(D) 1.0 · 10⁻⁴ P_(m)(p.u.) 0.5 ν (V) 3.6 f_(o) (Hz) 60 τ 5.0 · 10³ A (μA) 129 I_(m) (μA) 50g_(mA) (mho) 0.0130

Upon closer inspection, the results of the actual angular solution aredifferent with a large degree of error. The ABM power system modelsolved for the angular solution of 22.89°, while the OTA analog emulatormodel solved for the angular solution of 27.79°. The difference betweenthe two results has an error of approximately 21-22%. Since thesteady-state calculation of the actual electrical angle using equation(26) results in an angle of 22.89°, the analog emulator seems to producethe error.

Since the present invention focuses on the study of load flow analysisof power systems, and not the intricate design of electronic devices,the offset correction method that will be applied to this example willbe the post-processing of the data to correct for the offset of thedevices.

To correct for the offset, the circuit was set with zero mechanicalinput power, Pm=0. The circuit emulator provided an offset solutionbecause the gains amplify the offsets within the devices. This angularoffset solution was subtracted from the PSpice emulator simulation forthe given mechanical power. FIG. 44 shows the results of the offsetcorrection compared to the original results of the analog emulator withPSpice. With the offset eliminated from the PSpice solution, thecorrected angular solution is now 22.74°, which is much closer to thedesired solution. The results from this example and the errorscalculated are tabulated in Table 15.

TABLE 15 Result Comparison and Errors Method Delta (deg) Error (%)Calculation 22.185 — PSpice ASM 22.886 0.00 PSpice OTA 27.787 21.4PSpice OTA 22.740 0.63 Offset Correction

This single-machine infinite-bus power system example shows that theanalog emulator achieves computational speedups over the real powersystem. The disadvantage of the analog circuit is that the accuracy ofthe circuit may not be as good as current digital computer methods forsolving load flow. The results show with offset correction methods, anadvantageous tradeoff is made between accuracy and computationalspeedup.

The example provided results of the advantages of the analog emulatorincluding the ability to reconfigure system parameters and thecomputational speedups achieved in the circuit. The present invention isapplicable to many different areas. One method would be the introductionof initial conditions on the integrator to start at a value that may becloser to a desired stable solution. Also, more complicated models couldbe developed to expand this model to include damping and higher ordergenerator models.

As has been demonstrated above, the present invention is useful formodeling of power systems and power system components. This can be usedfor fail-safe planning, as well as for operations planning. Thus, thepresent system is flexible enough to allow long-term planning over thelife of the system, as well as to facilitate short-term operationsplanning for operating the system as little as 24 hours to 7 days aheadof time. This is possible since the present invention uses the hardwareup to about 10 times more efficiently than all-purpose computingplatforms.

The hardware of the present invention, including the FPGA and relatedcomponents, are flexible enough to be customized for a variety ofdifferent calculations. Thus, although the present invention has beenexemplified for power system planning, it is generally applicable toother systems requiring complex computations as well.

It is to be understood, however, that even though numerouscharacteristics and advantages of the present invention have been setforth in the foregoing description, together with details of thestructure and function of the invention, the disclosure is illustrativeonly, and changes may be made in detail, especially in matters of shape,size and arrangement of parts within the principles of the invention tothe full extent indicated by the broad general meaning of the terms.

REFERENCES

-   [1] Feng Tu and A. J. Flueck, A message-passing distributed-memory    parallel power flow algorithm, Power Engineering Society Winter    Meeting, 2002. IEEE, Volume 1 (2002), pp 211-216.-   [2] X. Wang and S. G. Ziavras, Parallel Direct Solution of Linear    Equations on FPGA-Based Machines, Workshop on Parallel and    Distributed Real-Time Systems (17th Annual IEEE International    Parallel and Distributed Processing Symposium), Nice, France, April    22-23, 2003.-   [3] Keith Underwood, FPGAs vs. CPUs: Trends in peak floating point    performance, ACM/SIGDA Twelfth ACM International Symposium on    Field-Programmable Gate Arrays, (Monterrey, Calif.), February 2004.-   [4] A. R. Bergen, V. Vittal, Power Systems Analysis, 2nd Edition,    Prentice Hall, 2000.-   [5] P. Amestoy, T. A. Davis, and I. S. Duff, An approximate minimum    degree ordering algorithm, SIAM J. Matrix Anal. Applic., 17 (1996),    pp. 886-905.-   [6] M. Yannakakis. Computing the minimum fill-in is NP-Complete,    SIAM J. Alg. Disc. Meth., 2 (1981), pp. 77-79.-   [7] W. H. Press, S. A. Teukolsky, W. T. Vettering and Brian P.    Flannery, Numerical Recipes in C: the Art of Scientific Computing,    2nd Edition, New York: Cambridge University Press, 1992.-   [8] T. A. Davis and I. S. Duff, An unsymmetric-pattern multifrontal    method for sparse lu factorization, SIAM J. Matrix Anal. Applic., 18    (1997), pp. 140-158.-   [101] R. Fried, R. S. Cherkaoui, and C. C. Enz, “Low-Power CMOS,    Analog Transient-Stability-Simulator for a Two-Machine Power    System,” presented at ISCAS, Hong-Kong, 1997.-   [102] R. Fried, R. S. Cherkaoui, C. C. Enz, A. Ger-mond, and E. A.    Vittoz, “Approaches for analog VLSI simulation of the transient    stability of large power networks,” Ieee Transactions on Circuits    and Systems I-Fundamental Theory and Applications, vol. 46, pp.    1249-1263, 1999.-   [103] R. Fried, R. S. Cheikaoui, C. C. Enz, A. Germond, and E. A.    Vittoz, “Analog VLSI Solution to the Stability Study of Power    Networks,” presented at ICECS, 1996.-   [104] S. P. Carullo, M. Olaleye, and C. O. Nwankpa, “VLSI Based    Analog Power System Emulator for Fast Contingency Analysis,”    presented at Hawaii International Conference on System Science,    Hawaii, 2004.-   [105] J. Ramirez-Angulo and I. Grau, “Wide gm Adjustment Range,    Highly Linear OTA with Linear Programmable Current Mirrors,”    presented at ISCAS, 1992.-   [106] Z. Wang, “Making CMOS OTA a Linear Transconductor,”    Electronics Letters, vol. 26, pp. 1448-1449, 1990.-   [107] C. A. Karybakas, C. Kosmatopoulos, and T. Laopoulos, “Improved    Temperature Compensation of Otas,” Electronics Letters, vol. 28, pp.    763-764, 1992.-   [108] N. Semiconductor, “LM13700 Dual Operational Transconductance    Amplifiers with Linearizing Diodes and Buffers,” 2000.-   [109] A. Gratz, “Operational Transconductance Amplifiers,”    Unpublished.-   [110] R. Wunderlich, J. Oehm, A. Dollberg, and K. Schumacher, “A    Linear Operational Transconductance Amplifier with Automatic Offset    Cancellation and Transconductance Calibration,” presented at    International Conference on Electronics, Circuits and Systems, 1999.-   [111] “PowerWorld Simulator 10.0,” PowerWorld Corporation, 2004.-   [201] Capture CIS version 9.2.3, Cadence Design Systems, Inc.-   [202] M. B. Olaleye and C. O. Nwankpa, “Analog Behavioral Models for    the Purpose of Analog Emulation of Large Scale Power Systems,” The    Proceedings of the 36th North American Power Symposium (NAPS), pp.    97-104, August 2004.-   [203] B. Gilbert, “A monolithic Microsystem for Analog Synthesis of    Trigonometric Functions and Their Inverses”, IEEE Journal of Solid    State Circuits, Vol. SC-17, No. 6, pp. 1179-1191, December 1982.-   [204] P. C. Krause, T. A. Lipo and D. P. Carroll, “Applications of    Analog and Hybrid Computation in Electric Power System Analysis,”    Proceedings of the IEEE, Vol. 62, No. 7, July 1974.-   [205] P. G. McLaren, P. Forsyth, A. Perks and P. R. Bishop, “New    Simulation Tools for Power Systems”, Transmission and Distribution    Conference and Exposition, 2001. IEEE/PES, vol. 1, 28, pp. 91-96,    October-2 Nov. 2001.-   [206] A. Greenwood, Electrical Transients in Power Systems. 2nd    Edition, John Wiley & Sons, Inc., New York, 1991-   [207] G. A. Korn and T. M. Korn, Electronic Analogue Computers (D-C    Analogue Computers), McGraw-Hill, New York, 2nd edition, 1956.-   [208] M. Hirakami and W. Neugebauer, “Transient Network Analyzer    Operation with Digital Computer Control and Analysis,” IEEE    Transactions on Power Apparatus and Systems, Vol. PAS-100, No. 4,    April 1981.-   [209] M. Crow, Computational Methods for Electric Power Systems. CRC    Press, 2003.-   [210] R. W. Newcomb and J. D. Lohn, “Analog VLSI for Neural    Networks” in Handbook of Brain Theory and Neural Networks, M. Arbib    Ed., Bradford Books, MIT Press, 1995.-   [211] R. Fried, R. S. Cherkaoui, C. C. Enz, A. Germond, and E. A.    Vittoz, “Approaches for Analog VLSI Simulation of the Transient    Stability of Large Power Networks,” IEEE Transactions on Circuits    and Systems-I: Fundamental Theory and Applications, Vol. 46, No. 10,    pp. 1249-1263, October 1999.-   [212] PowerWorld Simulator v.9.1, PowerWorld Corporation, Champaign    Ill.-   [213] Power System Simulator for Engineering (PSS/E)-28.1.4, Power    Technologies, Inc.-   [301] J. B. Ku and S. C. Lin, Low-Voltage SOI CMOS VLSI Devices and    Circuits. New York: John Wiley & Sons Inc, 2001.-   [302] R. Fried, R. Cherkaoui, C. Enz, A. Germond, and E. Vittoz,    “Approaches for analog VLSI simulation of the transient stability of    large power networks,” IEEE Transactions on Circuits and Systems    I-Fundamental Theory and Applications, vol. 46, pp. 1249-1263, 1999.-   [303] J. Gu, G. G. Karady, and R. Farmer, “Real-time analysis of    transient stability using reconfigurable analog VLSI,” IEEE    Transactions on Power Systems, vol. 18, pp. 1207-1209, 2003.-   [304] J. Yakaski and C. Nwankpa, “Insight into Reconfigurable Analog    Emulation of the Classical Generator Model,” presented at The 36th    Annual North American Power Symposium, University of Idaho, Moscow,    Id., 2004.-   [305] H. E. Brown, Solution of Large Networks by Matrix Methods. New    York: Wiley, 1985.-   [306] R. L. Geiger and E. Sanchez-Sinencio, “Active Filter Design    Using Operational Transconductance Amplifiers: A Tutorial,” in IEEE    Circuits and De-vices Magazine, vol. 1, 1985, pp. 20-32.-   [307] National Semiconductor, LM13700 Dual Operational    Transconductance Amplifiers with Linearizing Diodes and Buffers,    August 2000.-   [308] Analog Devices, AD639 Universal Trigonometric Function    Converter, October 1994.-   [309] A. Bergen and V. Vittal, Power System Analysis, 2nd ed:    Prentice Hall, 2000.-   [310] M. Olaleye and C. Nwankpa, “Analog Behavioral Models for the    Purpose of Analog Emulation of Large Scale Power Systems,” presented    at The 36th Annual North American Power Symposium, University of    Idaho, Moscow, Id., 2004.-   [311] R. Wunderlich, J. Oehm, A. Dollberg, and K. Schumacher, “A    Linear Operational Transconductance Amplifier with Automatic Offset    Cancellation and Transconductance Calibration,” presented at The 6th    IEEE International Conference on Electronics, Circuits and Systems,    University of Patras, Greece, 1999.

1. An apparatus for carrying out complex computations which comprises: afield programmable gate array, said field programmable gate arrayprogrammed for implementation of Gaussian elimination by decompositionof a matrix into lower and upper triangulation factors and forward andbackward elimination; memory operably connected to said fieldprogrammable gate array, and a device for maintaining pivotinginformation operably connected to said field programmable gate array. 2.Apparatus as claimed in claim 1, wherein said device for maintainingpivoting information comprises a lookup table and memory pointers. 3.Apparatus as claimed in claim 2, wherein said field programmable gatearray implements the following Gaussian elimination for each column inthe matrix loop: (a) search the column for a pivot element; (b) pivot ifnecessary; (c) normalize a pivot column; (d) update a remaining portionof the matrix; (e) multiply to form a sub-matrix update; and (f) updatea value or create a fill-in.
 4. Apparatus as claimed in claim 3, whereinsaid apparatus is configured to simultaneously perform memory reads andmemory writes between said field programmable gate array and saidmemory.
 5. Apparatus as claimed in claim 4, further comprising a randomaccess memory for storing an array of the indices of non-zero elementsin the matrix, said random access memory being operably connected tosaid field programmable gate array.
 6. Apparatus as claimed in claim 5,further comprising index rejection logic to select appropriate indicesfrom said stored array of indices.
 7. Apparatus as claimed in claim 6,wherein said memory comprises a row cache of memory of sufficient sizeto store a largest sub-matrix update for said computation.
 8. Apparatusas claimed in claim 7, wherein further comprising a table to keep trackof elimination progress.
 9. Apparatus as claimed in claim 1, whereinsaid complex computation involves solution of at least one sparse linearsystem.
 10. Apparatus as claimed in claim 1, configured to emulate powerflow in a power transmission system.
 11. A method for carrying outcomplex computations which comprises the steps of: (a) implementingGaussian elimination by decomposition of a matrix into lower and uppertriangulation factors and forward and backward elimination; (b) storingand retrieving matrix values in a memory; and (c) maintaining pivotinginformation.
 12. A method as claimed in claim 11, wherein said step ofimplementing Gaussian elimination comprises the steps of: Gaussianelimination for each column in the matrix loop: (a) searching a columnof the matrix for a pivot element; (b) pivoting, if necessary; (c)normalizing a pivot column; (d) updating a remaining portion of thematrix; (e) multiplying to form a sub-matrix update; and (f) updating avalue or create a fill-in.
 13. A method as claimed in claim 12, whereinsaid step of storing and retrieving matrix values in memory comprisessimultaneously storing and retrieving matrix values.
 14. A method asclaimed in claim 13, further comprising the step of tracking eliminationprogress to track elements that have already been eliminated.
 15. Amethod as claimed in claim 14, further comprising the step of selectingindices from said stored array to avoid selection of matrix elementsthat have already been eliminated.
 16. A system for emulating power flowin a power transmission system, which comprises: a power supply; avariable resistor including at least one reconfigurable operationaltransconductance amplifier operably connected to said power supply; anda device for measuring a current or voltage output of said system.
 17. Asystem as claimed in claim 16, comprising at least two reconfigurableoperational transconductance amplifiers for modeling power transmissionin at least two directions.
 18. A method for emulating a powertransmission system comprising the steps of: (a) calculating a complexvoltage from at least one generator using a dynamic swing equation; (b)calculating a complex current flowing in a branch of said powertransmission system; and (c) calculating a real power of said at leastone generator; and (d) updating the dynamic swing equation based on aresult of step (c).
 19. A method as claimed in claim 18, furthercomprising the step of: (e) solving for a steady-state solution.
 20. Amethod as claimed in claim 19, further comprising the step of: (f)correcting for offset of said steady-state solution by calculating anoffset solution for zero mechanical input power, and subtracting theoffset solution from said steady-state solution.